doi: 10.17586/2226-1494-2024-24-6-982-990


Sentiment analysis of Arabic tweets using supervised machine learning (in English)

A. Benabdallah, M. Abderrahim, M. Mokri


Article in English

For citation:
Benabdallah A., Abderrahim M.A., Mokri M. Sentiment analysis of Arabic tweets using supervised machine learning. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2024, vol. 24, no. 6, pp. 982–990. doi: 10.17586/2226-1494-2024-24-6-982-990 


Abstract
The increasing volume of user-generated content on social media platforms necessitates effective tools for understanding public sentiment. This study presents an approach to sentiment analysis of Arabic tweets using supervised machine learning techniques. We evaluated three popular algorithms, Support Vector Machines (SVM), Naive Bayes (NB), and Logistic Regression (LR), on two distinct corpora: the Arabic Sentiment Text Corpus (ASTC) and a dataset of Arabic tweets. Our methodology comprised four tests assessing the impact of corpus characteristics, preprocessing techniques, N-grams, and weighting methods on classification accuracy. The first test established that the choice of corpus significantly influences model performance: SVM showed superior accuracy on the structured ASTC, while NB excelled on the informal Arabic tweets. In the second test, preprocessing steps, including the removal of punctuation and stop-words, noticeably improved classification accuracy on the Arabic tweets but had minimal or even negative effects on the ASTC. The third test indicated that incorporating N-grams yielded modest improvements for NB and LR on the more structured texts, while their impact on tweets was negligible. Finally, the fourth test compared different weighting techniques, revealing that SVM benefited from the Term Frequency-Inverse Document Frequency (TF-IDF) weighting method, while NB performance remained stable regardless of the weighting approach. These findings underscore the importance of tailoring preprocessing and feature extraction strategies to the specific characteristics of the dataset, ultimately enhancing the accuracy of sentiment analysis in Arabic language contexts.

Keywords: Arabic sentiment analysis (ASA), machine learning, classifier, polarity, Twitter
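
The experimental setup described in the abstract (punctuation and stop-word removal, a choice of weighting method, optional N-grams, and three classifiers) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn, uses `LinearSVC` as a stand-in for SVM and raw term counts as the non-TF-IDF weighting, and relies on a hypothetical toy stop-word list and four invented example sentences in place of the ASTC and tweet corpora. The names `preprocess`, `build_models`, `STOP_WORDS`, `TEXTS`, and `LABELS` are illustrative only.

```python
import re

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy stop-word list (assumption; not the list used in the study).
STOP_WORDS = {"في", "من", "على", "هذا", "جدا"}

# Invented toy examples standing in for a labelled corpus (assumption).
TEXTS = [
    "الخدمة ممتازة جدا!",
    "فيلم رائع وجميل",
    "الخدمة سيئة جدا.",
    "فيلم ممل وسيء",
]
LABELS = ["pos", "pos", "neg", "neg"]


def preprocess(text):
    """Replace punctuation with spaces, then drop stop-words."""
    text = re.sub(r"[^\w\s]", " ", text)  # \w is Unicode-aware, so Arabic letters are kept
    return " ".join(tok for tok in text.split() if tok not in STOP_WORDS)


def build_models(weighting="tfidf", ngram_range=(1, 1)):
    """Return one pipeline per classifier for a given weighting and N-gram range.

    weighting: "tfidf" for TF-IDF weights, anything else for raw counts (assumed alternative).
    ngram_range: (1, 1) for unigrams only, (1, 2) for unigrams plus bigrams, and so on.
    """
    vectorizer_cls = TfidfVectorizer if weighting == "tfidf" else CountVectorizer
    classifiers = {
        "SVM": LinearSVC(),
        "NB": MultinomialNB(),
        "LR": LogisticRegression(max_iter=1000),
    }
    return {
        name: make_pipeline(
            vectorizer_cls(preprocessor=preprocess, ngram_range=ngram_range),
            clf,
        )
        for name, clf in classifiers.items()
    }


# Fit each configuration on the toy data and print its predictions on the same data.
for weighting in ("tfidf", "count"):
    for name, model in build_models(weighting, ngram_range=(1, 2)).items():
        model.fit(TEXTS, LABELS)
        print(weighting, name, list(model.predict(TEXTS)))
```

On a real corpus, each pipeline would be scored on held-out data (for example with `sklearn.model_selection.cross_val_score`) rather than on its own training texts, and the comparison would be repeated with and without `preprocess` to isolate the effect of preprocessing.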



This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License